New Filter Structure based on Admissible Wavelet Packet Transform for Text-Independent Speaker Identification

Authors

  • Mangesh S. Deshpande
  • Raghunath S. Holambe
Abstract

Identical acoustic features such as Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC) are widely used for different tasks such as speech recognition and speaker recognition, even though the requirements of speaker recognition differ from those of speech recognition. In the MFCC feature representation, the Mel frequency scale gives high resolution in the low-frequency region and low resolution in the high-frequency region. This kind of processing is good for obtaining stable phonetic information, but it is not suitable for speaker features that are located in high-frequency regions. Furthermore, MFCC uses the short-time Fourier transform (STFT), which has a fixed time-frequency resolution. Considering these facts, in this paper we propose a new filter structure based on the admissible wavelet packet transform for text-independent speaker identification. The multiresolution capabilities of the wavelet packet transform are used to derive the new features. The performance of the proposed features is evaluated using the most commonly used Gaussian mixture model (GMM) classifier as well as the continuous density hidden Markov model (CDHMM) classifier. The proposed features yield a higher speaker identification rate than MFCC and other wavelet-transform-based features. The results further show that CDHMM works better than GMM for a small number of mixture densities. An identification accuracy of 99.76% is achieved in experiments on the TIMIT database.
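
The idea of deriving features from wavelet packet subbands can be illustrated with a minimal sketch. The abstract does not specify the wavelet filter, the tree structure, or the frame length, so everything below is an assumption made for illustration: a Haar filter pair, a hypothetical tree `example_tree` that splits only the high-pass branch further, and a 256-sample random frame standing in for speech. The helper names `haar_split`, `packet_leaves`, and `log_energies` are also hypothetical. This is not the authors' filter structure.

```python
import numpy as np

def haar_split(x):
    """One Haar analysis step: return (low-pass, high-pass) halves at half the rate."""
    x = x[: len(x) // 2 * 2]              # drop a trailing sample if the length is odd
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def packet_leaves(x, tree):
    """Decompose x along a binary tree.

    tree is None for a leaf, or a (low_subtree, high_subtree) pair for a split node.
    Leaves are returned in tree order, which is not necessarily frequency order.
    """
    if tree is None:
        return [x]
    low, high = haar_split(x)
    return packet_leaves(low, tree[0]) + packet_leaves(high, tree[1])

def log_energies(frame, tree):
    """Log energy of each leaf subband (small constant avoids log(0))."""
    return np.array([np.log(np.sum(b ** 2) + 1e-12)
                     for b in packet_leaves(frame, tree)])

# Assumed inputs: a random frame and a tree that refines only the high-pass branch.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
example_tree = (None, ((None, None), (None, None)))
features = log_energies(frame, example_tree)
print(features.shape)
```

In this sketch a tree counts as admissible when every node is either a leaf or is split into exactly two children, so the tree need not be full and some parts of the band can be analysed more finely than others. A cepstral-style feature vector could then be formed by applying a DCT to the log energies; that step is likewise an assumption, not a procedure stated in the abstract.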

Similar resources

Speaker Identification Using Admissible Wavelet Packet Based Decomposition

Mel Frequency Cepstral Coefficient (MFCC) features are widely used as acoustic features for speech recognition as well as speaker recognition. In MFCC feature representation, the Mel frequency scale is used to get a high resolution in low frequency region, and a low resolution in high frequency region. This kind of processing is good for obtaining stable phonetic information, but not suitable f...

Optimized computational Afin image algorithm using combination of update coefficients and wavelet packet conversion

Updating Optimal Coefficients and Selected Observations Affine Projection is an effective way to reduce the computational and power consumption of this algorithm in the application of adaptive filters. On the other hand, the calculation of this algorithm can be reduced by using subbands and applying the concept of filtering the Set-Membership in each subband. Considering these concepts, the fir...

Text-independent Speaker Recognition

In this paper, a text-independent speaker recognition method based on the Wavelet Transform and mel-cepstrum is presented. The experimental results indicate the best Wavelet Transform parameters for speaker identification and can be useful for designing speaker identification systems. This kind of person identification method is useful in services such as banking by telephone, access authorization to...

Discrete Wavelet Transform & Linear Prediction Coding Based Method for Speech Recognition via Neural Network

In the proposed work, wavelet transform (WT) and neural network techniques were introduced for speech-based text-independent speaker identification and Arabic vowel recognition. A feature extraction method based on the linear prediction coding coefficients (LPCC) of the level-3 discrete wavelet transform (DWT) was developed. The feature vector is fed to probabilistic neural networks (PNN) for classifica...

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Publication date: 2009